ensemble layer
The Diversified Ensemble Neural Network
Ensembling is a general way of improving the accuracy and stability of learning models, especially their generalization ability on small datasets. Compared with tree-based methods, relatively little work has been devoted to an in-depth study of effective ensemble design for neural networks. In this paper, we propose a principled ensemble technique by constructing the so-called diversified ensemble layer to combine multiple networks as individual modules. We theoretically show that each individual model in our ensemble corresponds to weights in the ensemble layer that are optimized in different directions. Meanwhile, the devised ensemble layer can be readily integrated into popular neural architectures, including CNNs, RNNs, and GCNs. Extensive experiments are conducted on public tabular datasets, images, and texts. By adopting a weight-sharing approach, our method notably improves the accuracy and stability of the original neural networks with negligible extra time and space overhead.
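As a rough illustration of the idea of an ensemble layer over weight-shared modules (a minimal sketch under assumptions, not the paper's exact formulation), the PyTorch snippet below combines several classification heads on top of a shared backbone with learnable combination weights; `EnsembleLayer`, the backbone, and all dimensions are hypothetical.

```python
# Minimal sketch (assumed PyTorch API): several heads over a shared backbone,
# combined by learnable weights. Not the paper's exact diversified ensemble layer.
import torch
import torch.nn as nn


class EnsembleLayer(nn.Module):
    """Combines M individual heads over a shared representation."""

    def __init__(self, feat_dim: int, num_classes: int, num_heads: int = 4):
        super().__init__()
        # Each head plays the role of one individual model in the ensemble.
        self.heads = nn.ModuleList(
            nn.Linear(feat_dim, num_classes) for _ in range(num_heads)
        )
        # Learnable combination weights over the heads.
        self.alpha = nn.Parameter(torch.zeros(num_heads))

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        logits = torch.stack([h(features) for h in self.heads], dim=0)   # (M, B, C)
        weights = torch.softmax(self.alpha, dim=0).view(-1, 1, 1)        # (M, 1, 1)
        return (weights * logits).sum(dim=0)                             # (B, C)


# Usage: the heads share one backbone, which is the weight-sharing part.
backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
model = nn.Sequential(backbone, EnsembleLayer(feat_dim=64, num_classes=10))
out = model(torch.randn(8, 32))  # shape (8, 10)
```

The paper additionally pushes the individual modules toward diverse optimization directions; that diversity term is omitted in this sketch.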
Reviewer #1
Eqs. (3) and (4) do relate to the diversity; we will emphasize this fact in our final version. Must each individual model be weak? Our approach leaves room to control the complexity of each learner and its generalization ability on new datasets. The applicability can also be justified by the use of GPU power. The accuracy in Table 3 of our paper is the mean over 10 trials.
Robust Natural Language Understanding with Residual Attention Debiasing
Wang, Fei, Huang, James Y., Yan, Tianyi, Zhou, Wenxuan, Chen, Muhao
Natural language understanding (NLU) models often suffer from unintended dataset biases. Among bias mitigation methods, ensemble-based debiasing methods, especially product-of-experts (PoE), have stood out for their impressive empirical success. However, previous ensemble-based debiasing methods typically apply debiasing to top-level logits without directly addressing biased attention patterns. Attention serves as the main medium of feature interaction and aggregation in pre-trained language models (PLMs) and plays a crucial role in providing robust predictions. In this paper, we propose REsidual Attention Debiasing (READ), an end-to-end debiasing method that mitigates unintended biases from attention. Experiments on three NLU tasks show that READ significantly improves the performance of BERT-based models on OOD data with shortcuts removed, including +12.9% accuracy on HANS, +11.0% accuracy on FEVER-Symmetric, and +2.7% F1 on PAWS. Detailed analyses demonstrate the crucial role of unbiased attention in robust NLU models and that READ effectively mitigates biases in attention. Code is available at https://github.com/luka-group/READ.
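For background, the logit-level product-of-experts ensemble that READ builds upon is commonly implemented roughly as below. This is a generic sketch, not code from the linked repository; `main_logits` and `bias_logits` are placeholder tensors from an unspecified main model and bias-only model.

```python
# Generic sketch of product-of-experts (PoE) ensemble debiasing at the logit
# level (the baseline READ improves on); models and tensors are placeholders.
import torch
import torch.nn.functional as F


def poe_loss(main_logits: torch.Tensor,
             bias_logits: torch.Tensor,
             labels: torch.Tensor) -> torch.Tensor:
    """Cross-entropy on the product of the two experts' distributions."""
    combined = F.log_softmax(main_logits, dim=-1) + F.log_softmax(bias_logits, dim=-1)
    # The bias model is typically frozen; only the main model receives gradients.
    return F.cross_entropy(combined, labels)


# Training uses both experts; at test time only the main model is kept, so
# examples solvable by the shortcut contribute little gradient to it.
main_logits = torch.randn(4, 3, requires_grad=True)
bias_logits = torch.randn(4, 3).detach()
loss = poe_loss(main_logits, bias_logits, torch.tensor([0, 1, 2, 0]))
loss.backward()
```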
On Deep Ensemble Learning from a Function Approximation Perspective
Zhang, Jiawei, Cui, Limeng, Gouza, Fisher B.
In this paper, we propose a general ensemble learning framework based on deep learning models. Given a group of unit models, the proposed deep ensemble learning framework effectively combines their learning results via a multilayered ensemble model. When the unit models' mathematical mappings are bounded, sigmoidal, and discriminatory, we demonstrate that the deep ensemble learning framework can achieve universal approximation of any function from the input space to the output space. Meanwhile, to achieve such performance, the deep ensemble learning framework also imposes a strict constraint on the number of involved unit models. According to the theoretical proof provided in this paper, given an input feature space of dimension d, the required number of unit models is 2^d if the ensemble model involves a single layer. Furthermore, as the ensemble component goes deeper, the number of required unit models is proven to decrease exponentially.
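To make the single-layer setting concrete, the toy sketch below wires 2^d simple sigmoidal unit models into one ensemble component. The structure and dimensions are illustrative assumptions, not the framework's actual construction or proof.

```python
# Toy sketch of a multilayered ensemble over unit models (hypothetical
# structure, using the 2^d single-layer count quoted in the abstract).
import torch
import torch.nn as nn

d = 4                      # input feature dimension
num_units = 2 ** d         # single-layer unit count quoted in the abstract

# Unit models: simple bounded, sigmoidal mappings from R^d to R.
units = nn.ModuleList(
    nn.Sequential(nn.Linear(d, 1), nn.Sigmoid()) for _ in range(num_units)
)

# Ensemble component that combines the unit outputs; stacking more such
# layers is what the abstract claims reduces the required number of units.
ensemble = nn.Sequential(nn.Linear(num_units, 8), nn.Sigmoid(), nn.Linear(8, 1))

x = torch.randn(16, d)
unit_out = torch.cat([u(x) for u in units], dim=-1)  # (16, 2^d)
y = ensemble(unit_out)                                # (16, 1)
```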